Abstract. Depending on application needs, spatial data produced by
digitizing or imported from other formats must undergo specific operations
before any use in order to preserve spatial data consistency, a component
of data quality considered an indispensable part of an ISO metadata model.
Against this background, our work contributes to the field of spatial data
quality improvement. It involves applying a multi-stage methodology for
detecting and correcting errors.
1. Introduction

Geographic information in digital form is generally provided in insufficient
quantity and in various structures, which makes such data difficult to
exploit. Addressing this deficiency requires collecting all available data
and spending considerable effort to integrate and standardize them. The
main purpose of providing topological information in geographic information
systems (GIS) is to improve spatial analysis capabilities (Egenhoffer
1989, Herring 1989).

Several main components of spatial data quality have been identified by
international standardization bodies such as ISO/TC 211, OGC and FGDC. They
comprise seven usual quality elements: lineage, positional accuracy,
attribute accuracy, semantic accuracy, temporal accuracy, logical
consistency and completeness (Wang F. 2008).

Our work focuses on the data consistency issue among the spatial data
quality components, which involves logical consistency. Owing to complex
geographic data characteristics, varied data capture workflows and different
data sources, the final large datasets are often inconsistent, incomplete
and inaccurate. To reduce spatial data inconsistency and provide users with
data of adequate quality, the spatial data consistency requirements should
be explicitly specified.



Defining spatial integrity constraints or rules is one of the solutions
used by current approaches for specifying data consistency requirements.
Nevertheless, these existing approaches are not well structured or not
sufficient to deliver all the content users need. Consequently, the complex
content makes the defined requirements difficult to understand.



2. Vector data structures 

The basic element of vector data is the point. Points make up lines, and a
set of lines makes up a polygon. To represent spatial features in a vector
data structure, the coordinate pairs that make up those vectors are first
stored in digital form. With respect to spatial analysis, vector data
structures can be divided into two basic classes:

1. Non-topological data structure.
2. Topological data structure. 

2.1. Non-topological data structure 

This kind of structure is called the spaghetti data structure. Three basic
geometric forms (point, line, polygon) are used to represent spatial
features. Features represented by a point are zero-dimensional elements,
each defined by one coordinate pair (x, y). Line-shaped features are
one-dimensional elements defined by a series of consecutive (x, y)
coordinates. Polygon-shaped features are two-dimensional closed shapes
formed by lines starting and ending at the same point.
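
As a minimal sketch of the spaghetti model described above, the three
geometric forms can be held as plain coordinate lists. The names `Coord`,
`point`, `line`, `polygon` and `is_closed` are illustrative assumptions and
are not taken from the original work.

```python
from typing import List, Tuple

Coord = Tuple[float, float]

# Zero-dimensional feature: one coordinate pair (x, y).
point: Coord = (2.0, 3.0)

# One-dimensional feature: an ordered series of coordinate pairs.
line: List[Coord] = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)]

# Two-dimensional feature: a ring that starts and ends at the same point.
polygon: List[Coord] = [
    (0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0),
]


def is_closed(ring: List[Coord]) -> bool:
    """Return True if the ring has at least 4 vertices and its ends coincide."""
    return len(ring) >= 4 and ring[0] == ring[-1]
```

Each feature stores its own coordinates independently, so a boundary shared
by two polygons is recorded twice and nothing in the structure states that
the two features are neighbors.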

Non-topologically structured data, usually obtained as a result of the
digitization process, present the following problems that prevent spatial
data analysis (Pequet and Marble, 1990):

a) Line-line intersection: point features at intersections of lines.

b) Polygon features are not closed properly.

c) Neighborhood relations cannot be determined.

d) Contact points (or extremities) do not coincide.

e) Uniqueness of graphic objects is not assured; overlaps or gaps between
polygon features occur.

f) Point, line or polygon objects included in a polygon are almost
undetectable.

g) Navigation and the computation of paths and trajectories are impossible,
since line features have no concept of direction.
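
Problems b) and d) above can be searched for with simple coordinate tests.
The following is a hedged sketch only: the function names and the tolerance
parameter `tol` are assumptions introduced for illustration, and the
pairwise comparison is quadratic, so a spatial index would be needed for
large datasets.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float]


def unclosed_rings(rings: List[List[Coord]]) -> List[int]:
    """Return the indexes of rings whose first and last vertices differ."""
    return [i for i, ring in enumerate(rings) if ring[0] != ring[-1]]


def near_miss_endpoints(
    lines: List[List[Coord]], tol: float
) -> List[Tuple[Coord, Coord]]:
    """Return pairs of endpoints of different lines that are closer than
    tol without being identical (contact points that do not coincide)."""
    ends = []
    for i, ln in enumerate(lines):
        ends.append((i, ln[0]))
        ends.append((i, ln[-1]))

    pairs = []
    for a in range(len(ends)):
        for b in range(a + 1, len(ends)):
            (ia, pa), (ib, pb) = ends[a], ends[b]
            if ia == ib:
                continue  # ignore the two ends of the same line
            d = math.dist(pa, pb)
            if 0.0 < d <= tol:
                pairs.append((pa, pb))
    return pairs
```

A correction step could then snap each reported pair to a common
coordinate; which of the two points to keep is a choice this sketch leaves
open.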

2.2. Topological data structure 

In GIS, topology is the geometric relationship between edges, nodes and
the faces they create. According to another definition, topology is a way
or method in which logical relations such as neighborhood, coincidence,
inclusion, intersection and sharing can be defined, in addition to metric
relationships such as geometrically identifiable coordinates, length and
area (Bank, 1997).

To be able to evaluate a topological database, the following relationships
must be determined and stored in addition to the geometric properties:

a) Edges making up the boundaries of each polygon (polygon topology ta- 
ble); 

b) Neighborhood relations between the polygons (edge topology table); 

c) Connections at the intersection points (node topology table); 

d) Start and end points of edges (edge-coordinate data table). 
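
The tables listed above can be sketched for two adjacent square polygons,
A and B, that share one edge. The identifiers, the table layout and the
helper names `neighbors` and `edges_at_node` are assumptions made for
illustration; actual GIS packages organize these tables in their own ways.

```python
from typing import Dict, List, Optional, Set, Tuple

# Node table: node id -> coordinates.
nodes: Dict[str, Tuple[float, float]] = {
    "n1": (0.0, 0.0), "n2": (2.0, 0.0), "n3": (2.0, 2.0),
    "n4": (0.0, 2.0), "n5": (4.0, 0.0), "n6": (4.0, 2.0),
}

# Edge table: edge id -> (start node, end node, left polygon, right polygon).
# None stands for the outside face.
edges: Dict[str, Tuple[str, str, Optional[str], Optional[str]]] = {
    "e1": ("n1", "n2", "A", None),
    "e2": ("n2", "n3", "A", "B"),  # boundary shared by A and B
    "e3": ("n3", "n4", "A", None),
    "e4": ("n4", "n1", "A", None),
    "e5": ("n2", "n5", "B", None),
    "e6": ("n5", "n6", "B", None),
    "e7": ("n6", "n3", "B", None),
}

# Polygon table: polygon id -> edges making up its boundary.
polygons: Dict[str, List[str]] = {
    "A": ["e1", "e2", "e3", "e4"],
    "B": ["e5", "e6", "e7", "e2"],
}


def neighbors(poly_id: str) -> Set[str]:
    """Polygons sharing at least one edge with poly_id."""
    result = set()
    for _start, _end, left, right in edges.values():
        if left == poly_id and right is not None:
            result.add(right)
        elif right == poly_id and left is not None:
            result.add(left)
    return result


def edges_at_node(node_id: str) -> List[str]:
    """Edges that start or end at the given node."""
    return sorted(e for e, (s, t, _l, _r) in edges.items() if node_id in (s, t))
```

With these tables, neighborhood and connectivity are answered by lookups
rather than by geometric computation, which is the property the spaghetti
structure lacks.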

The topologic data structure is often referred to as an intelligent data
structure because spatial relationships between geographic features are
easily derived when using it. Primarily for this reason, the topologic model
is the dominant vector data structure currently used in GIS technology. Many
complex data analysis functions cannot be undertaken effectively without a
topologic vector data structure (Buckley, 1998).



3. Topological errors in spatial vector data 

Such errors occur in most cases when converting data into a topological
structure. They stem from the original quality of the source data and the
characteristics of the data capture process. Usually the data are captured
by scanning, which allows a user to retrieve spatial data from a paper
product, for example a map, and have it recorded by the computer software.
Most GIS software has utilities to clean the data and construct a
topological structure. Interactive editing of data is a distinct reality in
the data input process (Suleyman, 2010). The most common topological error
types in spatial vector data are:

1. Floating or short lines

2. Overlapping lines 

3. Overshoots and undershoots 



4. Unclosed and weird polygons 
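
Some of the error types listed above can be flagged by counting how many
line ends meet at each coordinate. The sketch below is one possible
approach, not the procedure used in this work: `classify_lines` and
`min_length` are hypothetical names, and the lines are assumed to be
already split at their intersections so that connected lines share
identical endpoint coordinates. It reports dangling endpoints, which arise
from both overshoots and undershoots; telling the two apart would require
additional intersection tests.

```python
import math
from collections import Counter
from typing import List, Tuple

Coord = Tuple[float, float]
Line = List[Coord]


def line_length(line: Line) -> float:
    """Sum of the segment lengths along the line."""
    return sum(math.dist(a, b) for a, b in zip(line, line[1:]))


def classify_lines(lines: List[Line], min_length: float):
    """Return (short line indexes, floating line indexes, dangling endpoints)."""
    # Number of line ends located at each coordinate.
    degree = Counter()
    for ln in lines:
        degree[ln[0]] += 1
        degree[ln[-1]] += 1

    short, floating, dangles = [], [], []
    for i, ln in enumerate(lines):
        if line_length(ln) < min_length:
            short.append(i)
        start_free = degree[ln[0]] == 1
        end_free = degree[ln[-1]] == 1
        if start_free and end_free:
            floating.append(i)      # connected to nothing at either end
        elif start_free:
            dangles.append(ln[0])   # one end touches no other line
        elif end_free:
            dangles.append(ln[-1])
    return short, floating, dangles
```

The flagged items are candidates only; as noted in the text, atypical cases
still call for interactive editing.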

Figure 1. Interface for choosing the geometric properties of Point, Line,
Region, Network and Tessellation objects.




Figure 2. Examples of point, line and surface errors (contact point error,
isolated line, overlap of two buildings).



Figure 3. Example of an atypical digitization error; the correction is made
interactively.

4. Conclusion 

The techniques used demonstrate the effectiveness of the approach and
contribute to improving the internal consistency of an existing database in
vector format. This is done by defining procedures for searching for and
correcting spatial errors that can be integrated with several existing
databases.

We are interested in consistency problems since we work on existing
databases without any other source of information. Consistency is indeed the
only spatial quality component that does not require comparison with another
source considered more consistent, usually referred to as the nominal
terrain.

Our work advocates correcting the errors detected in spatial databases. It
is advisable to ensure the consistency of stored data, which requires a good
representation of reality. We believe that preserving the consistency of a
geographic database means both checking the geometry and validating the
topology.

Topology is a powerful tool for advanced spatial queries; a system with
this feature ensures good data consistency.